How I won the "Chess Ratings - Elo vs the Rest of the World" Competition
Abstract
This article describes in detail the rating system that won the Kaggle competition “Chess Ratings: Elo vs the Rest of the World”. The competition provided a historical dataset of chess game outcomes and aimed to discover whether novel approaches can predict the outcomes of future games more accurately than the well-known Elo rating system. The winning system, called Elo++ in the rest of the article, builds upon Elo. Like Elo, Elo++ uses a single rating per player and predicts the outcome of a game by applying a logistic curve to the difference in the players' ratings. The major component of Elo++ is a regularization technique that avoids overfitting. The dataset of chess games and outcomes is relatively small, and one has to be careful not to draw “too many conclusions” from the limited data. Overfitting seems to have been a problem for many approaches tested in the competition: the leaderboard was dominated by attempts that did a very good job on a small test dataset but could not generalize as well as Elo++ on the private hold-out dataset. The Elo++ regularization takes into account the number of games per player, the recency of these games, and the ratings of the opponents. Finally, Elo++ employs a stochastic gradient descent scheme for training the ratings.
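The ingredients named in the abstract (one rating per player, a logistic curve over the rating difference, a regularizer that looks at game counts, recency and opponents' ratings, and stochastic gradient descent) can be sketched roughly as follows. The abstract gives no formulas, so everything specific here is an assumption made for illustration rather than the article's actual definition: the squared-error loss, the quadratic recency weight, the pull toward the average rating of a player's opponents, and the names `predict`, `train`, `lr` and `lam`.

```python
import math
import random


def predict(r_white, r_black):
    """Expected score for White: a logistic curve over the rating difference."""
    return 1.0 / (1.0 + math.exp(-(r_white - r_black)))


def train(games, n_players, epochs=50, lr=0.1, lam=0.5, seed=0):
    """Fit one rating per player with stochastic gradient descent.

    games: list of (month, white, black, score); score is White's result
    (1 win, 0.5 draw, 0 loss). All hyperparameters are hypothetical.
    """
    rng = random.Random(seed)
    ratings = [0.0] * n_players
    t_min = min(g[0] for g in games)
    t_max = max(g[0] for g in games)

    # Opponents faced by each player, one entry per game played.
    opponents = [[] for _ in range(n_players)]
    for _, white, black, _ in games:
        opponents[white].append(black)
        opponents[black].append(white)

    def reg_grad(player):
        # Assumed regularizer: pull the rating toward the mean rating of the
        # player's opponents. Dividing by the game count keeps the total pull
        # per epoch equal for everyone, so players with few games (and hence
        # little data gradient) are regularized relatively more.
        opps = opponents[player]
        anchor = sum(ratings[o] for o in opps) / len(opps)
        return lam / len(opps) * (ratings[player] - anchor)

    order = list(games)
    for _ in range(epochs):
        rng.shuffle(order)
        for month, white, black, score in order:
            # Assumed recency weight: recent games count more (weight in (0, 1]).
            weight = ((1 + month - t_min) / (1 + t_max - t_min)) ** 2
            p = predict(ratings[white], ratings[black])
            # Gradient of 0.5 * weight * (p - score)**2 with respect to r_white.
            data_grad = weight * (p - score) * p * (1.0 - p)
            step_white = data_grad + reg_grad(white)
            step_black = -data_grad + reg_grad(black)
            ratings[white] -= lr * step_white
            ratings[black] -= lr * step_black
    return ratings


# Toy data: player 0 wins every game; players 1 and 2 draw with each other.
games = [(1, 0, 1, 1.0), (2, 0, 1, 1.0), (3, 1, 2, 0.5), (4, 0, 2, 1.0)]
ratings = train(games, n_players=3)
print(ratings, predict(ratings[0], ratings[1]))
```

With `lam = 0` this reduces to plain least-squares fitting of a logistic model, which is where overfitting on a small dataset would be expected to show up; raising `lam` trades fit on the training games for ratings that stay closer to those of each player's opponents.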
Related articles
Sex differences in chess performance: Analyzing participation rates, age, and practice in chess tournaments
This study analyzed sex differences in chess Elo ratings with chess tournament data. We evaluated whether sex differences were due to differential participation rates of males and females, and whether age and practice were able to predict differences in chess ability. There were meaningful sex differences in Elo ratings unrelated to different participation rates. Age and p...
Bias in the ELO‒system of online chess
ELO is the key performance indicator in chess, a global and worldwide measure of chess skills and strength of chess play. Since their invention, ELO-systems are intended to be comparable and unbiased, so that chess players can know their level of play and can compare it among systems and internationally. Moreover, ELO is the defining feature of grandmasters with legal implications. Thus, it is ...
Predicting the Outcome of Chess Games based on Historical Data
This report describes an approach used in the competition “Chess Ratings – Elo versus the Rest of the World” that took place between August and November 2010. Using the Bradley-Terry model as starting point, we postulate that the strength of each player can be approximated by the expected score against a common but unknown reference player, whose strength is never actually computed. This approa...
The Impact of Search Depth on Chess Playing Strength
How deep does a chess Grandmaster think? This question has been asked many times, and yet there is hardly a definite answer. Raw depth and pure calculation are certainly not the only factors in the thinking process of a chess player, but it would be interesting to know more about the relationship between search depth and playing strength, so that the strength of a given player (which is usually...
Intrinsic Chess Ratings
This paper develops and tests formulas for representing playing strength at chess by the quality of moves played, rather than by the results of games. Intrinsic quality is estimated via evaluations given by computer chess programs run to high depth, ideally so that their playing strength is sufficiently far ahead of the best human players as to be a ‘relatively omniscient’ guide. Several formul...
Journal: CoRR
Volume: abs/1012.4571
Publication date: 2010